Do not use AVX2 instructions if the CPU doesn't support it#67
Open
edwintorok wants to merge 2 commits intoOpenVisualCloud:masterfrom
Open
Do not use AVX2 instructions if the CPU doesn't support it#67edwintorok wants to merge 2 commits intoOpenVisualCloud:masterfrom
edwintorok wants to merge 2 commits intoOpenVisualCloud:masterfrom
Conversation
Otherwise we crash with SIGILL on a CPU without AVX (e.g. AMD Opteron 6172). Signed-off-by: Edwin Török <edvin.torok@citrix.com>
To avoid a SIGILL on non-AVX2 CPU use the correct codepath. `ASM_TYPES & PREAVX2_MASK` is always `1`, which chooses the avx2 path and crashes. Use `ASM_TYPES & AVX2_MASK` which should be `0` if the CPU doesn't support AVX2, choosing the non-AVX2 path and avoiding a SIGILL. Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Author
|
Is there a test suite I could run to check whether it also works correctly? |
Contributor
|
Hi @edwintorok, thank you for this PR. |
Author
|
The appveyor CI error seems unrelated: |
Contributor
|
Hi @edwintorok, yes, it is not related. |
Contributor
|
Hi @edwintorok, fyi, before merging this PR, there is some effort to evaluate performance on legacy CPU. With the scarcity of legacy HW, it will take longer for us to move forward with this PR. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The code for checking CPUID and alternate non-avx2 implementations were all mostly in place, except:
xgetbvwas used before checking CPUID, which caused a SIGILLFixes #9
This code is part of the Phoronix test suite, and I noticed that it crashed on AMD Opteron Processor 6172, now it no longer crashes:
Note: other SVT projects suffer from same problem: